Goto

Collaborating Authors

 adaptive embedding


Efficient Second-Order Online Kernel Learning with Adaptive Embedding

Neural Information Processing Systems

Online kernel learning (OKL) is a flexible framework to approach prediction problems, since the large approximation space provided by reproducing kernel Hilbert spaces can contain an accurate function for the problem. Nonetheless, optimizing over this space is computationally expensive. Not only first order methods accumulate $\O(\sqrt{T})$ more loss than the optimal function, but the curse of kernelization results in a $\O(t)$ per step complexity. Second-order methods get closer to the optimum much faster, suffering only $\O(\log(T))$ regret, but second-order updates are even more expensive, with a $\O(t^2)$ per-step cost. Existing approximate OKL methods try to reduce this complexity either by limiting the Support Vectors (SV) introduced in the predictor, or by avoiding the kernelization process altogether using embedding.


Unveiling the Inflexibility of Adaptive Embedding in Traffic Forecasting

Wang, Hongjun, Chen, Jiyuan, Zhang, Lingyu, Jiang, Renhe, Song, Xuan

arXiv.org Artificial Intelligence

Spatiotemporal Graph Neural Networks (ST-GNNs) and Transformers have shown significant promise in traffic forecasting by effectively modeling temporal and spatial correlations. However, rapid urbanization in recent years has led to dynamic shifts in traffic patterns and travel demand, posing major challenges for accurate long-term traffic prediction. The generalization capability of ST-GNNs in extended temporal scenarios and cross-city applications remains largely unexplored. In this study, we evaluate state-of-the-art models on an extended traffic benchmark and observe substantial performance degradation in existing ST-GNNs over time, which we attribute to their limited inductive capabilities. Our analysis reveals that this degradation stems from an inability to adapt to evolving spatial relationships within urban environments. To address this limitation, we reconsider the design of adaptive embeddings and propose a Principal Component Analysis (PCA) embedding approach that enables models to adapt to new scenarios without retraining. We incorporate PCA embeddings into existing ST-GNN and Transformer architectures, achieving marked improvements in performance. Notably, PCA embeddings allow for flexibility in graph structures between training and testing, enabling models trained on one city to perform zero-shot predictions on other cities. This adaptability demonstrates the potential of PCA embeddings in enhancing the robustness and generalization of spatiotemporal models.


Reviews: Efficient Second-Order Online Kernel Learning with Adaptive Embedding

Neural Information Processing Systems

The paper proposes an efficient second-order online kernel learning mainly by combining KONS and Nystrom method. NOVELTY The novelty is limited on both the methodological and theoretical contributions. The achieved results do not have profound implication for the advancement of theory and practice. WRITING QUALITY The English writing and organization of this paper are relatively good. The reviewer strongly suggests the authors arrange Table 2 in the main paper rather than in Appendix because the experimental results in Table 2 are the core material.


Efficient Second-Order Online Kernel Learning with Adaptive Embedding

Calandriello, Daniele, Lazaric, Alessandro, Valko, Michal

Neural Information Processing Systems

Online kernel learning (OKL) is a flexible framework to approach prediction problems, since the large approximation space provided by reproducing kernel Hilbert spaces can contain an accurate function for the problem. Nonetheless, optimizing over this space is computationally expensive. Not only first order methods accumulate $\O(\sqrt{T})$ more loss than the optimal function, but the curse of kernelization results in a $\O(t)$ per step complexity. Second-order methods get closer to the optimum much faster, suffering only $\O(\log(T))$ regret, but second-order updates are even more expensive, with a $\O(t 2)$ per-step cost. Existing approximate OKL methods try to reduce this complexity either by limiting the Support Vectors (SV) introduced in the predictor, or by avoiding the kernelization process altogether using embedding.